Database Query Standard
Overview
This document defines the standard patterns for database queries in the ATOM SaaS backend. Consistent query patterns improve performance, maintainability, and reduce bugs.
Decision Framework
When to Use ORM (SQLAlchemy)
✅ **Use ORM for:**
- Single table queries (CRUD operations)
- Simple relationships (joins with foreign keys)
- Business logic that benefits from object mapping
- Queries that benefit from SQLAlchemy's session management
- Read operations that need relationship loading
**Examples:**
# ✅ GOOD: Simple CRUD
agent = db.query(AgentRegistry).filter(
AgentRegistry.id == agent_id,
AgentRegistry.tenant_id == tenant_id
).first()
# ✅ GOOD: Relationship loading
proposal = db.query(AgentProposal).options(
joinedload(AgentProposal.agent)
).filter(AgentProposal.id == proposal_id).first()
# ✅ GOOD: Create/Update
new_agent = AgentRegistry(
id=str(uuid.uuid4()),
tenant_id=tenant_id,
name=name,
role=role
)
db.add(new_agent)
db.commit()When to Use Raw SQL
✅ **Use Raw SQL for:**
- Complex aggregations (GROUP BY, window functions)
- Performance-critical queries
- Complex joins across many tables
- Queries where ORM overhead is significant
- Bulk operations
- Analytics and reporting
**Examples:**
# ✅ GOOD: Complex aggregation
result = await db.execute(text("""
SELECT
agent_id,
COUNT(*) as total_executions,
AVG(execution_time) as avg_time,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY execution_time) as median_time
FROM agent_executions
WHERE tenant_id = :tenant_id
GROUP BY agent_id
"""), {"tenant_id": tenant_id})
# ✅ GOOD: Performance-critical query
agents = await db.execute(text("""
SELECT id, name, role, status
FROM agent_registry
WHERE tenant_id = :tenant_id
AND enabled = true
AND status = :status
ORDER BY confidence_score DESC
LIMIT :limit
"""), {"tenant_id": tenant_id, "status": "autonomous", "limit": 10})Standard Patterns
1. Tenant Isolation (CRITICAL)
**ALWAYS filter by tenant_id in queries:**
# ✅ GOOD: Explicit tenant filtering
result = await db.execute(
select(AgentRegistry).where(
AgentRegistry.tenant_id == tenant_id
)
)
# ❌ BAD: Missing tenant filter
result = await db.execute(
select(AgentRegistry)
)2. Async Queries
**Use async/await for database operations:**
# ✅ GOOD: Async query
result = await db.execute(
select(AgentRegistry).where(AgentRegistry.id == agent_id)
)
agent = result.scalar_one_or_none()
# ❌ BAD: Sync query in async context
agent = db.query(AgentRegistry).filter(AgentRegistry.id == agent_id).first()3. Query Helpers
**Use helper functions for common patterns:**
# ✅ GOOD: Using helper
agent = await get_agent_by_id(db, agent_id, tenant_id)
# Instead of:
result = await db.execute(
select(AgentRegistry).where(
AgentRegistry.id == agent_id,
AgentRegistry.tenant_id == tenant_id
)
)
agent = result.scalar_one_or_none()4. Error Handling
**Handle missing results gracefully:**
# ✅ GOOD: Handle missing data
result = await db.execute(
select(AgentRegistry).where(AgentRegistry.id == agent_id)
)
agent = result.scalar_one_or_none()
if not agent:
raise ValueError(f"Agent {agent_id} not found")
# ❌ BAD: Crash on missing data
agent = result.scalar_one() # Raises NoResultFound5. Bulk Operations
**Use bulk operations for performance:**
# ✅ GOOD: Bulk insert
await db.execute(
insert(AgentProposal).values([
{
'id': str(uuid.uuid4()),
'agent_id': agent_id,
'tenant_id': tenant_id,
# ... other fields
}
for _ in range(100)
])
)
await db.commit()
# ❌ BAD: Loop insert
for i in range(100):
proposal = AgentProposal(...)
db.add(proposal)
await db.commit() # Very slowQuery Helpers
Common Query Patterns
from typing import Optional, List
from sqlalchemy import select, and_
from sqlalchemy.ext.asyncio import AsyncSession
async def get_by_id(
db: AsyncSession,
model,
id: str,
tenant_id: str
):
"""Get a record by ID with tenant isolation."""
result = await db.execute(
select(model).where(
and_(
model.id == id,
model.tenant_id == tenant_id
)
)
)
return result.scalar_one_or_none()
async def list_with_filters(
db: AsyncSession,
model,
tenant_id: str,
filters: dict = None,
limit: int = 50,
offset: int = 0
) -> List:
"""List records with filters and pagination."""
query = select(model).where(model.tenant_id == tenant_id)
if filters:
for key, value in filters.items():
if hasattr(model, key):
query = query.where(getattr(model, key) == value)
query = query.limit(limit).offset(offset)
result = await db.execute(query)
return result.scalars().all()
async def exists(
db: AsyncSession,
model,
tenant_id: str,
**filters
) -> bool:
"""Check if a record exists."""
conditions = [model.tenant_id == tenant_id]
for key, value in filters.items():
if hasattr(model, key):
conditions.append(getattr(model, key) == value)
result = await db.execute(
select(model.id).where(and_(*conditions)).limit(1)
)
return result.scalar_one_or_none() is not NonePerformance Guidelines
1. Use Indexes
**Create indexes for frequently queried columns:**
# In your model
class AgentRegistry(Base):
__tablename__ = "agent_registry"
id = Column(String, primary_key=True)
tenant_id = Column(String, ForeignKey("tenants.id"), index=True) # ✅ Indexed
status = Column(String, index=True) # ✅ Indexed
category = Column(String, index=True) # ✅ Indexed2. Use EXPLAIN ANALYZE
**Analyze slow queries:**
# For debugging
result = await db.execute(text("""
EXPLAIN ANALYZE
SELECT * FROM agent_registry
WHERE tenant_id = :tenant_id
AND status = :status
"""), {"tenant_id": tenant_id, "status": "autonomous"})
print(result.all())3. Avoid N+1 Queries
**Use eager loading for relationships:**
# ✅ GOOD: Eager loading
from sqlalchemy.orm import joinedload
proposals = await db.execute(
select(AgentProposal).options(
joinedload(AgentProposal.agent)
).where(AgentProposal.tenant_id == tenant_id)
)
# ❌ BAD: N+1 queries
proposals = await db.execute(
select(AgentProposal).where(AgentProposal.tenant_id == tenant_id)
)
for proposal in proposals:
# This triggers a separate query for each proposal!
agent = proposal.agent4. Use Pagination
**Always paginate list queries:**
# ✅ GOOD: Paginated
result = await db.execute(
select(AgentRegistry)
.where(AgentRegistry.tenant_id == tenant_id)
.limit(50)
.offset(0)
)
# ❌ BAD: No pagination (could return millions of rows)
result = await db.execute(
select(AgentRegistry).where(AgentRegistry.tenant_id == tenant_id)
)Migration Guide
Before (Mixed Patterns)
# Some use ORM
agent = db.query(AgentRegistry).filter(
AgentRegistry.id == agent_id
).first()
# Others use raw SQL
result = db.execute(
"SELECT * FROM agents WHERE id = :id",
{"id": agent_id}
)
agent = result.fetchone()
# Others use async ORM
result = await db.execute(
select(AgentRegistry).where(AgentRegistry.id == agent_id)
)
agent = result.scalar_one_or_none()After (Standardized)
# Always use async ORM for simple queries
agent = await get_agent_by_id(db, agent_id, tenant_id)
# Always use raw SQL for complex queries
agents = await db.execute(text("""
SELECT ...complex aggregation...
"""), {"tenant_id": tenant_id})Testing
Test Query Performance
import time
def test_query_performance():
start = time.time()
result = await db.execute(
select(AgentRegistry).where(AgentRegistry.tenant_id == tenant_id)
)
agents = result.scalars().all()
elapsed = time.time() - start
assert elapsed < 0.1, f"Query too slow: {elapsed}s"
assert len(agents) <= 100, "Too many results (use pagination)"Test Tenant Isolation
def test_tenant_isolation():
# Should not see other tenants' data
agents = await db.execute(
select(AgentRegistry).where(AgentRegistry.tenant_id == tenant_id)
)
for agent in agents:
assert agent.tenant_id == tenant_id, "Tenant isolation violated!"Monitoring
Track Query Performance
- Log slow queries (>100ms)
- Monitor query frequency
- Track query patterns
- Alert on N+1 queries
Metrics to Track
- Average query time
- Query frequency
- Rows returned per query
- Index usage
Best Practices Summary
- ✅ Always use async/await for database operations
- ✅ Always filter by tenant_id
- ✅ Use ORM for simple queries (CRUD)
- ✅ Use raw SQL for complex queries (aggregations, analytics)
- ✅ Use query helpers for common patterns
- ✅ Always paginate list queries
- ✅ Use eager loading to avoid N+1 queries
- ✅ Create indexes for frequently queried columns
- ✅ Use EXPLAIN ANALYZE for slow queries
- ✅ Handle missing data gracefully
References
- SQLAlchemy Documentation: https://docs.sqlalchemy.org/
- PostgreSQL Performance: https://www.postgresql.org/docs/current/performance-tips.html
- Query Helpers:
backend-saas/core/query_helpers.py(to be created)
Changelog
- 2026-02-08: Initial standard created
- 2026-02-08: Decision framework documented
- 2026-02-08: Common patterns defined